Process for making genes encoding random polymers of amino acids.

A method for making a synthetic gene encoding a random polymer of predetermined amino acid
constituents comprises the polymerisation of small oligonucleotide duplexes.

The synthesis can also be used as part of a method for identifying amino-acid polymers with biological
and/or immunological activity.
PROCESS FOR MAKING GENES ENCODING RANDOM POLYMERS OF AMINO ACIDS



Background of the Invention 



Copolymer 1 (COP-1) is a synthetic polypeptide analog of myelin basic protein (MBP), which is a
natural component of the myelin sheath. It has been suggested as a potential therapeutic agent for multiple
sclerosis (Eur. J. Immunol. [1971] 1:242; and J. Neurol. Sci. [1977] 31:433). Interest in COP-1 as an
immunotherapy for multiple sclerosis stems from earlier observations that myelin
components such as MBP prevent or arrest experimental autoimmune encephalomyelitis (EAE). EAE is a
disease resembling multiple sclerosis that can be induced in susceptible animals.

COP-1 was developed by Drs. Sela, Arnon, and their co-workers at the Weizmann Institute (Rehovot,
Israel). It was shown to suppress experimental allergic encephalomyelitis (EAE) (Eur. J. Immunol. [1971]
1:242-248; US Patent No. 3,849,550). More recently, COP-1 was shown to be beneficial for patients with
the exacerbating-remitting form of multiple sclerosis (N. Engl. J. Med. [1987] 317:408-414). Patients treated
with daily injections of COP-1 had fewer exacerbations and smaller increases in their disability status than
control patients.

COP-1 is a mixture of polypeptides composed of alanine, glutamic acid, lysine, and tyrosine in a molar
ratio of approximately 6:2:5:1, respectively. It is synthesized by chemically polymerizing the four amino
acids, forming products with average molecular weights of 23,000 daltons. Although the resulting
polypeptides are comprised of the same amino acid components, they differ with respect to their amino acid
sequences. In fact, there is an enormous number of possible ways (a large power of ten) to assemble a
23,000 dalton polypeptide composed of alanine, glutamic acid, lysine and tyrosine in the designated ratios.
Purification of one or even a small number of distinct COP-1 polypeptides from chemically-synthesized
COP-1 is not possible.
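
The scale of this sequence diversity can be estimated with a multinomial coefficient. The following is a minimal sketch, not part of the original disclosure: the residue masses are standard textbook values assumed here, and the chain is assumed to contain a whole number of 6:2:5:1 repeat units.

```python
import math

def log10_sequence_count(counts):
    """Return log10 of the multinomial coefficient n! / (c1! * c2! * ...)."""
    n = sum(counts)
    ln_count = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln_count / math.log(10)

# Assumed residue masses in daltons (standard values, not taken from the text).
RESIDUE_MASS = {"Ala": 71.08, "Glu": 129.12, "Lys": 128.17, "Tyr": 163.18}
# Molar ratio Ala:Glu:Lys:Tyr as stated in the text.
MOLAR_RATIO = {"Ala": 6, "Glu": 2, "Lys": 5, "Tyr": 1}

def residue_counts(total_daltons):
    """Approximate residue counts for a chain of the given mass at the 6:2:5:1 ratio."""
    unit_mass = sum(RESIDUE_MASS[aa] * MOLAR_RATIO[aa] for aa in MOLAR_RATIO)
    units = round(total_daltons / unit_mass)
    return {aa: MOLAR_RATIO[aa] * units for aa in MOLAR_RATIO}

counts = residue_counts(23000)
exponent = log10_sequence_count(counts.values())
print(counts, "distinct sequences: about 10 to the power", round(exponent))
```

Under these assumptions the exponent comes out above one hundred, which is why isolating individual species from the chemical product is impractical.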

Studies evaluating COP-1's efficacy have been hindered somewhat by inconsistent batches of COP-1.
Also, it is not known which of the amino acid polymer(s) is responsible for the biological activity of COP-1.
Other random sequence amino acid copolymers related to COP-1 have been chemically synthesized and
tested for the ability to suppress experimental allergic encephalomyelitis (Eur. J. Immunol. [1973] 3:273;
Immunochemistry [1976] 13:333). Biological activity was observed in EAE assays using COP-1 related
polymers in which one of the following changes occurs: tyrosine is replaced by tryptophan; or glutamic acid
is replaced by aspartic acid; or tyrosine is excluded.

We have developed procedures for synthesizing genes encoding polypeptides composed of specific
amino acids, but having random amino acid sequences. The amino acid composition of the polypeptides is
dictated by the set of codons incorporated in the synthetic genes. Likewise, the size of the polypeptides is
controlled by synthesizing genes of specific lengths.

Brief Summary of the Invention 

The subject invention concerns a method for synthesizing genes which encode random polymers of
amino acids. A further aspect of the invention is the identification of certain polypeptides which are
expressed by the synthetic genes and which have high levels of biological activity. The general
methodology of the subject invention is outlined in Figure 1.

A critical step in the novel process of the subject invention is the polymerization of small
oligonucleotide duplexes. Preferably, the oligonucleotide duplexes consist of a multiple of three nucleotides.
The length of the synthetic genes can be controlled through the use of adaptors specific for the 5' and 3'
ends of the oligonucleotides. Further, the composition of the resultant polypeptides can be varied, with
respect to the relative proportions of the amino acid constituents, by varying the proportions of input
oligonucleotide duplexes.

The synthesis of polypeptides similar to COP-1 exemplifies the procedures of the subject invention. The
initial step in the procedure is the synthesis of genes which code for polypeptides containing
predetermined amino acid constituents. These genes are then cloned in an expression vector and introduced into
E. coli such that each recombinant bacterial colony contains one COP-1 gene. To generate a mixture of COP-1
polypeptides (analogous to the chemically synthesized product) we produce COP-1 polypeptides from a
pool of recombinant bacterial colonies containing COP-1 gene sequences, e.g., 1000 colonies.



EP 0 383 620 A2 



The efficacy of the pool of recombinant COP-1 polypeptides is tested in experimental allergic
encephalomyelitis (EAE) assays. If effective, the pool of colonies exhibiting activity is further subdivided
into smaller pools of colonies, and the polypeptides from these smaller pools are tested. By sequentially
subdividing and selecting the most active pools, we identify individual recombinant COP-1 polypeptides
or groups of polypeptides with biological activity in EAE assays equal to or higher than chemically
synthesized COP-1. The opportunity to characterize homogeneous, individual COP-1 polypeptides is unique.

The subject invention also concerns the synthetic genes and the polypeptides produced by the
methods disclosed herein. Advantageously, the procedures of the subject invention can be used to produce
polypeptides which may be useful in preventing, arresting, or controlling demyelinating disorders such as
multiple sclerosis. A preferred copolymer according to the subject invention consists substantially of
alanine, lysine, and either glutamic or aspartic acid. Polymers of any length or molecular weight can be
synthesized using the procedures of the subject invention. Another preferred copolymer further includes
either tyrosine or tryptophan. More specifically, a preferred copolymer may comprise alanine, lysine,
glutamic acid, and tyrosine, and have a molecular weight between about 5,000 and 50,000 daltons. Further,
the method of the subject invention can also be used to make fusion proteins.



Brief Description of the Drawings 



Figure 1 depicts the general methodology for synthesizing random genes and identifying polypeptides
with biological activity.
Figure 2 outlines a strategy for synthesizing genes encoding random-sequence polypeptides.
Figure 3 shows all possible 3-amino acid combinations and their percent occurrence in COP-1.
Figure 4 shows 9-nucleotide duplexes and adaptors for COP-1 gene synthesis.
Figure 5 shows the construction and cloning of synthetic COP-1 genes.
Figure 6 provides the sequence analysis of one synthetic gene revealing proper junctions between
duplexes.
Figure 7 is a Western blot showing the fusion protein produced by four different clones.
Figure 8 shows variations of random-gene synthesis using small duplexes.
Figure 9 shows synthesis of single-stranded DNA encoding random sequence amino acid polymers.
Figure 10 shows a phosphoramidite trinucleotide for random gene synthesis using a DNA synthesizer.
Figure 11 shows the DNA and amino acid sequences of rCOP-1-77.
Figure 12 shows the DNA and amino acid sequences of rCOP-1-19.
Figure 13 shows the average EAE scores of disease-induced guinea pigs which were untreated or
treated with myelin basic protein, rCOP-1-19, or rCOP-1-77.

Detailed Description of the Invention 

One strategy for synthesizing genes encoding random-sequence polypeptides is outlined in Figure 2.
The example shows the synthesis of genes composed of two DNA duplexes. Oligonucleotides comprising
the two duplexes are synthesized and annealed. Each DNA duplex has the same "sticky ends," represented
by X and X' in Figure 2. The duplexes are mixed together, sticky ends align, and the ends are joined in an
enzymatic reaction producing long segments of DNA (genes). Since the sticky ends on all duplexes are the
same, the duplexes can align and ligate in any order. Thus, a series of genes composed of the same
duplexes but in different orders is produced. The polypeptides encoded by these genes will have similar
amino acid compositions but different sequences.

This strategy involves significant modifications of the procedures previously used to construct genes
encoding specific proteins. First, random sequence genes are synthesized from small oligonucleotide
duplexes because more sequence variation is produced by mixing the order of small, rather than large,
duplexes. In contrast, long oligomers of 30 to 60 nucleotides are employed to synthesize a gene encoding
a defined amino acid sequence. Secondly, to construct random-sequence genes the sticky ends on each
duplex must be identical so that the duplexes can be joined together in any order. For genes
corresponding to defined amino acid sequences, the duplexes must ligate in a fixed order.
To achieve this ordering, the sticky ends at each junction must be unique. Thus, the process of the subject
invention is unique in that random-sequence genes are synthesized using oligonucleotide duplexes
encoding small segments of amino acids, and the sticky ends on each duplex are the same.

Our procedure for synthesizing genes encoding recombinant COP-1 polypeptides entails using
oligonucleotide duplexes encoding segments of 3 amino acids. All possible permutations of the 3 amino
acid segments comprised of the four COP-1 amino acids and their percent occurrences in COP-1 are
shown in Figure 3. To make recombinant COP-1 genes we have synthesized oligonucleotides
corresponding to the coding and noncoding strands for some of the 3 amino acid segments (Figure 4). The
oligonucleotides are phosphorylated at the 5' ends. Complementary pairs of oligonucleotides are annealed,
forming duplexes with the 3' nucleotide extending on each strand: adenosine on the coding strand, and
thymidine on the noncoding strand. Adenosine and thymidine base pair with one another, thus ensuring that
the duplexes are joined directionally, that is, coding strands to other coding strands. Since the duplexes
have the same nucleotide extensions, they can align in any order. When duplexes corresponding to all
sixty-four 3-amino acid blocks are mixed and ligated, COP-1 genes with all possible sequences are
produced.
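
The single-nucleotide A/T extensions described above can be illustrated with a short sketch. It is not the authors' oligonucleotide design (the actual sequences are in Figure 4, which is not reproduced here): the codon choices in `BLOCK_CODING` are hypothetical, chosen only so that every coding 9-mer ends in adenosine.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def noncoding_strand(coding_9mer):
    """Noncoding 9-mer (5'->3') that pairs with bases 1-8 of the coding strand
    and carries a 3' T able to pair with the 3' A of the upstream duplex."""
    if len(coding_9mer) != 9 or not coding_9mer.endswith("A"):
        raise ValueError("coding strand must be a 9-mer ending in A")
    return reverse_complement("A" + coding_9mer[:8])

def ligate(coding_9mers):
    """Join coding strands head to tail, as happens when duplexes ligate."""
    return "".join(coding_9mers)

# Hypothetical coding strands for the blocks KKA, EAE, KAK and YKK
# (assumed codons: K=AAA, A=GCA, E=GAA, Y=TAC).
BLOCK_CODING = {"KKA": "AAAAAAGCA", "EAE": "GAAGCAGAA",
                "KAK": "AAAGCAAAA", "YKK": "TACAAAAAA"}

order = ["EAE", "KKA", "YKK", "KAK"]          # any order is permitted
gene = ligate(BLOCK_CODING[b] for b in order)
bottom = "".join(noncoding_strand(BLOCK_CODING[b]) for b in reversed(order))
print(gene)
print(bottom)
```

Because every coding strand ends in A and every noncoding strand ends in T, the joined noncoding strands form one continuous complement of the gene, offset by a single nucleotide, whatever order the blocks are ligated in.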

To control the length of the synthetic genes, we have included adaptors specific for the 5' and 3' ends.
One strand of each duplex adaptor is not phosphorylated (see Figure 4). As a result, ligation products
terminate at the hydroxylated ends. By varying the ratio of the adaptor duplexes in the reaction, we can
control the length of the synthetic genes. The adaptors serve the second function of adding specific
extensions required to directionally clone the ligation products into the vector (Figure 5). The relative
proportions of amino acid constituents in the random polymers can also be modulated by mixing the
oligonucleotide duplexes in different ratios. For example, duplexes can be added to ligations in proportions
dictated by the percent occurrence of the corresponding amino acid segments in COP-1 (Figure 3).
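
One way to derive such input proportions is sketched below. This is an illustrative calculation only: it assumes each position in a 3-amino-acid block is filled independently according to the 6:2:5:1 molar ratio, and the values tabulated in Figure 3 (not reproduced here) may have been derived differently.

```python
from fractions import Fraction
from itertools import product

# Molar ratio Ala:Glu:Lys:Tyr from the text, in one-letter code.
MOLAR_RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}

def block_frequencies(ratio=MOLAR_RATIO, block_len=3):
    """Expected fraction of each block, assuming independent positions."""
    total = sum(ratio.values())
    freqs = {}
    for block in product(ratio, repeat=block_len):
        p = Fraction(1)
        for aa in block:
            p *= Fraction(ratio[aa], total)
        freqs["".join(block)] = p
    return freqs

freqs = block_frequencies()
for block in ("AAA", "KAK", "YYY"):
    print(block, f"{float(freqs[block]) * 100:.2f}%")
```

Exact fractions are used so that the 64 block frequencies can be checked to sum to one.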

The number of different duplexes incorporated determines the sequence complexity of the synthetic
genes. For example, COP-1 genes can be constructed using fewer than 64 duplexes, but not all sequence
combinations will occur. The amount of sequence complexity required will depend on the application of the
polypeptides.

One complication arises in producing completely random COP-1 amino acid sequences. For duplexes
to have the same extensions, the 3' nucleotide on the coding strands must be the same. Codons for
alanine, glutamic acid, and lysine can end with adenosine; however, for tyrosine, the last nucleotide is either
cytidine or uridine. Thus, duplexes encoding three amino acid segments ending in tyrosine (fourth column
on Figure 3) will have different extensions than the other duplexes. This limitation can be overcome by
making a second noncoding strand for each duplex with an extension of guanosine or adenosine. Because
this solution requires significantly more DNA synthesis, we have elected to exclude duplexes corresponding
to three amino acid segments ending in tyrosine.
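
The codon constraint can be checked directly against the standard genetic code. The sketch below lists the DNA codons for the four COP-1 amino acids and counts how many of the 64 blocks remain once those ending in tyrosine are excluded; the helper names are illustrative.

```python
from itertools import product

# Standard genetic code entries for the four COP-1 amino acids (DNA alphabet).
CODONS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "E": ["GAA", "GAG"],                # glutamic acid
    "K": ["AAA", "AAG"],                # lysine
    "Y": ["TAT", "TAC"],                # tyrosine
}

def can_end_in_adenosine(amino_acid):
    """True if at least one codon for this amino acid ends in A."""
    return any(codon.endswith("A") for codon in CODONS[amino_acid])

blocks = ["".join(b) for b in product("AEKY", repeat=3)]
usable = [b for b in blocks if can_end_in_adenosine(b[-1])]
print(len(blocks), "blocks in total,", len(usable), "usable with a 3' A extension")
```

Sixteen of the sixty-four blocks end in tyrosine, so excluding them leaves forty-eight candidate duplexes.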

Once the synthetic genes are made, they are cloned into a suitable expression vector and transferred to
a host capable of expressing the polypeptides. The host may be bacterial or eukaryotic cells. With bacteria,
cells are grown under conditions that permit formation of recombinant colonies. Each colony will contain
and express one synthetic gene. The polypeptides expressed by culture from specific colonies are isolated
and tested for the relevant biological or immunological activity. For example, COP-1 polypeptides are tested
for the ability to suppress EAE.

To screen a large number of polypeptides for activity, it may be advantageous to pool the polypeptides
before testing. Pools of polypeptides can be generated by either combining colonies and isolating the
polypeptides produced by the mixed culture or by culturing individual colonies, isolating the polypeptides
from each culture, and then combining the purified polypeptides. By testing pools of polypeptides, it is
possible to more quickly determine which of the colonies express biologically or immunologically active
polypeptides. For example, if polypeptides from 100 colonies do not exhibit activity, these colonies can be
eliminated from further investigation. If, on the other hand, the mixture of polypeptides does exhibit the
desired activity, then sequential subsets of the pooled polypeptides can be tested until the active
polypeptides, or mixture of polypeptides, is identified.
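
The pooling logic described above can be sketched as a recursive subdivision. This is an idealized illustration, not the assay protocol itself: it assumes a pool tests active exactly when it contains at least one active colony, ignoring dilution and additive effects, and the colony numbers and the `assay` function are hypothetical stand-ins for the biological test.

```python
def find_active(colonies, pool_is_active):
    """Return the colonies that test active, assaying pools before individuals."""
    if not colonies or not pool_is_active(colonies):
        return []                      # the whole pool is eliminated by one assay
    if len(colonies) == 1:
        return list(colonies)
    mid = len(colonies) // 2
    return (find_active(colonies[:mid], pool_is_active)
            + find_active(colonies[mid:], pool_is_active))

ACTIVE = {17, 412}   # hypothetical active colonies, unknown to the experimenter
calls = []           # records the size of every pool that is assayed

def assay(pool):
    """Stand-in for an activity test on the pooled polypeptides."""
    calls.append(len(pool))
    return any(colony in ACTIVE for colony in pool)

found = find_active(list(range(1000)), assay)
print(found, "identified with", len(calls), "assays instead of 1000")
```

Halving is only one possible subdivision scheme; pools of any convenient size can be used in the same way.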

Tests for some biological properties may be performed directly on the colonies, thus eliminating the
need for isolating polypeptides. For example, reactivity of polypeptides with antisera can be assessed in
colonies grown and lysed on nitrocellulose filters.



Materials and Methods 






Synthesis and Phosphorylation of Oligonucleotides. The coding and noncoding oligonucleotides for
each duplex are synthesized by the phosphite triester method with an Applied Biosystems Model 380 DNA
synthesizer. The 5' ends of oligonucleotides are phosphorylated on the DNA synthesizer using
2-[2-(4,4'-dimethoxytrityloxy)ethylsulfonyl]ethyl-(2-cyanoethyl)-(N,N-diisopropyl)-phosphoramidite from
Glen Research.

Purification of Oligonucleotides. The phosphorylated form of the oligonucleotide is separated from the
crude mixture by electrophoresis through a 20% acrylamide gel containing 7 M urea. Oligomers are eluted
from excised gel slices and desalted on Sep-Pak C-18 cartridges. Separation of the 5' phosphorylated
oligomer from hydroxylated forms is critical because hydroxylated oligomer causes the ligation reactions to
terminate prematurely, yielding very small products.

Annealing and Ligation of Oligonucleotides. Mixtures containing 1 nmol each of a coding and
complementary noncoding strand, 100 mM Tris-HCl, pH 7.8, and 0.1 mM ethylenediaminetetraacetic acid
(EDTA) in 50 microliters are heated to 80°C in a 600 ml water bath. The reactions are allowed to cool
slowly (2 hours) to room temperature. The temperature is decreased to 4°C over one hour by adding ice to
the water bath. For ligation, equal aliquots of the annealed duplexes are mixed and the solution is adjusted
to contain 10 mM MgCl2, 1 mM adenosine triphosphate (ATP), 1 mM DTT and polyethylene glycol.
The final concentration of Tris-HCl is 66 mM (from the annealing reactions). The total concentration of
9-mer duplexes is 10 pmol/microliter. Annealed adaptors are included in the reactions at a concentration
which will produce ligation products of the correct size and with termini that are compatible with sites in the
cloning vector. To obtain a maximum yield of synthetic genes within the correct size range,
ligation reactions are set up with increasing ratios of adaptor duplex:9-mer duplex (e.g., 1:50, 1:150, 1:300).
The reactions yielding products in the desired size range are pursued. We have determined that adding 1 ...

Ligation reactions are in 75 microliters with 600 units of T4 DNA ligase (New England Biolabs, Beverly,
MA). The reaction proceeds at 16°C for 16-20 hours. After ligation, polyethylene glycol is removed by
extraction with 3 volumes of chloroform. The products are concentrated by ethanol precipitation and
resuspended in 10 microliters.

Size Selection of Ligation Products. The concentrated reaction products are electrophoresed on 4%
NuSieve GTG agarose (FMC BioProducts, Rockland, ME) gels. Products are detected by staining with
ethidium bromide and the region corresponding to the desired size range (400-600 nucleotides) is excised.
Synthetic genes of 400-600 nucleotides encode polypeptides of 15,000-23,000 daltons. We have selected
genes of approximately this size because COP-1 polypeptides within this range were previously tested in
clinical trials. The agarose plug containing the synthetic DNA is stored at -20°C.
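
The relation between gene length and polypeptide size quoted above can be approximated as follows. This is a rough sketch: the mean residue mass of 110 daltons is a common rule-of-thumb value assumed here, not a figure stated in the text, so the results only approximate the 15,000-23,000 dalton range given above.

```python
def gene_to_polypeptide(nucleotides, mean_residue_mass=110.0):
    """Return (number of residues, approximate mass in daltons) for a coding region."""
    residues = nucleotides // 3          # three nucleotides per amino acid
    return residues, residues * mean_residue_mass

for length in (400, 600):
    residues, mass = gene_to_polypeptide(length)
    print(f"{length} nt -> {residues} residues, about {mass:,.0f} daltons")
```

The `mean_residue_mass` parameter can be adjusted to reflect the actual amino acid composition of the encoded polymer.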
The pREV 2.1 plasmid can be constructed from a plasmid pBG1. Plasmid pBG1 can be isolated from
its E. coli host by well known procedures, e.g., using cleared lysate-isopycnic density gradient procedures,
and the like. Plasmid pBG1 was deposited in the E. coli host MS371 with the Northern Regional Research
Laboratory (NRRL, U.S. Department of Agriculture, Peoria, Illinois, U.S.A.) on November 1, 1984 and was
assigned the accession number NRRL B-15904. pREV 2.1 was constructed from plasmid
pREV 2.2. Like pBG1, pREV2.2 expresses inserted genes behind the E. coli promoter. The differences
between pBG1 and pREV2.2 are the following:

1. pREV2.2 lacks a functional replication of plasmid (rop) protein.

2. pREV2.2 has the trpA transcription terminator inserted into the AatII site. This sequence insures
transcription termination of over-expressed genes.

3. pREV2.2 has genes to provide resistance to ampicillin and chloramphenicol, whereas pBG1
provides resistance only to ampicillin.

4. pREV2.2 contains a sequence encoding sites for several restriction endonucleases.

The following procedures were used to make each of the four changes listed above:

1a. 5 ug of plasmid pBG1 was restricted with NdeI, which gives two fragments of approximately 2160
and 3440 base pairs.

1b. 0.1 ug of DNA from the digestion mixture, after inactivation of the NdeI, was treated with T4 DNA
ligase under conditions that favor intramolecular ligation (200 ul reaction volume using standard T4 ligase
reaction conditions [New England Biolabs, Beverly, MA]). Intramolecular ligation of the 3440 base pair
fragment gave the desired plasmid. The ligation mixture was transformed into
E. coli JM103 (available from New England Biolabs) and ampicillin resistant clones were selected by
standard procedures.

1c. The product plasmid, pBG1ΔN, where the 2160 base pair NdeI fragment is deleted from pBG1,
was selected by preparing plasmid from ampicillin resistant clones and determining the restriction digestion
patterns with NdeI and SalI (product fragments approximately 1790 and 1650). This deletion inactivates the
rop gene that controls plasmid replication.

2a. 5 ug of pBG1ΔN was then digested with EcoRI and BclI and the larger fragment, approximately
2455 base pairs, was isolated.

2b. A synthetic double stranded fragment was prepared by the procedure of Itakura et al. (Itakura, K.,
J.J. Rossi, and R.B. Wallace [1984] Ann. Rev. Biochem. 53:323-356, and references therein) with the
following structure:

5' GATCAAGCTTCTGCAGTCGACGCATGCGGATCCGGTACCCGGGAGCTCG 3'
3'     TTCGAAGACGTCAGCTGCGTACGCCTAGGCCATGGGCCCTCGAGCTTAA 5'

This fragment has BclI and EcoRI sticky ends and contains recognition sequences for several restriction
endonucleases.

2c. 0.1 ug of the 2455 base pair EcoRI-BclI fragment and 0.01 ug of the synthetic fragment were
joined with T4 DNA ligase and competent cells of strain JM103 were transformed. Cells harboring the
recombinant plasmid, where the synthetic fragment was inserted into pBG1ΔN between the BclI and EcoRI
sites, were selected by digestion of the plasmid with HpaI and EcoRI. The diagnostic fragment sizes are
approximately 2355 and 200 base pairs. This plasmid is called pREV1.

2d. 5 ug of pREV1 were digested with AatII, which cleaves uniquely.

2e. The following double-stranded fragment was synthesized:

5'     CGGTACCAGCCCGCCTAATGAGCGGGCTTTTTTTTGACGT 3'
3' TGCAGCCATGGTCGGGCGGATTACTCGCCCGAAAAAAAAC     5'

This fragment has AatII sticky ends and contains the trpA transcription termination sequence.

2f. 0.1 ug of AatII digested pREV1 was ligated with 0.01 ug of the synthetic fragment in a volume of
20 ul using T4 DNA ligase.

2g. Cells of strain JM103, made competent, were transformed and ampicillin resistant clones
selected.

2h. Using a KpnI, EcoRI double restriction digest of plasmid isolated from selected colonies, a cell
containing the correct construction was isolated. The sizes of the KpnI, EcoRI generated fragments are
approximately 2475 and 80 base pairs. This plasmid is called pREV1TT and contains the trpA transcription
terminator.

3a. 5 ug of pREV1TT, prepared as disclosed above (by standard methods), was cleaved with NdeI
and XmnI and the approximately 850 base pair fragment was isolated.

3b. 5 ug of plasmid pBR325 (BRL, Gaithersburg, MD), which contains the genes conferring
resistance to chloramphenicol as well as to ampicillin and tetracycline, was cleaved with BclI and the ends
blunted with Klenow polymerase and deoxynucleotides. After inactivating the enzyme, the mixture was
treated with NdeI and the approximately 3185 base pair fragment was isolated. This fragment contains the
genes for chloramphenicol and ampicillin resistance and the origin of replication.

3c. 0.1 ug of the NdeI-XmnI fragment from pREV1TT and the NdeI-BclI fragment from pBR325 were
ligated in 20 ul with T4 DNA ligase and the mixture used to transform competent cells of strain JM103.
Cells resistant to both ampicillin and chloramphenicol were selected.

3d. Using an EcoRI and NdeI double digest of plasmid from selected clones, a plasmid was selected
giving fragment sizes of approximately 2480, 1145, and 410 base pairs. This is called plasmid pREV1TT/chl
and has genes for resistance to both ampicillin and chloramphenicol.
4a. The following double-stranded fragment was synthesized:

      MluI       EcoRV  ClaI  BamHI
5' CGAACGCGTGGCCGATATCATCGATGG
3' GCTTGCGCACCGGCTATAGTAGCTACC

       SalI  HindIII SmaI
   ATCCGTCGACAAGCTTCCCGGGAGCT 3'
   TAGGCAGCTGTTCGAAGGGCCC     5'

This fragment, with a blunt end and an SstI sticky end, contains recognition sequences for several
restriction enzymes.

4b. pREV1TT/chl was cleaved with NruI (which cleaves a short distance from the BclI
site) and SstI (which cleaves within the multiple cloning site). The larger fragment, approximately 3990 base
pairs, was isolated.

4c. The isolated fragment from pREV1TT/chl and 0.01 ug of the synthetic fragment were
treated with T4 DNA ligase in a volume of 20 ul.

4d. This mixture was transformed into strain JM103 and ampicillin resistant clones were selected.

4e. Plasmid was purified from several clones and screened by digestion with MluI or ClaI.
Recombinant clones with the new multiple cloning site will give one fragment when digested with either of
these enzymes, because each cleaves the plasmid once.

4f. The sequence of the multiple cloning site was verified by digesting the plasmid
with HpaI and PvuII and isolating the 1395 base pair fragment, cloning it into the SmaI site of mp18 and
sequencing it by dideoxynucleotide sequencing using standard methods.

4g. This plasmid is called pREV2.2.
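
The synthetic linkers used in steps 2b and 4a can be sanity-checked computationally. The sketch below is an illustrative check, not part of the original procedure: it tests that the two strands of each fragment, as transcribed above, are complementary apart from their single-stranded ends, and that the step 4a fragment carries the labelled restriction sites in the stated order. The recognition sequences are the standard ones for these enzymes.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def strands_pair(top, bottom_3to5, top_offset=0, bottom_offset=0):
    """True if the overlapping region is Watson-Crick complementary.

    top is written 5'->3' and bottom_3to5 is written 3'->5', as printed in
    the text; the offsets skip single-stranded overhangs at the left end.
    """
    a = top[top_offset:]
    b = bottom_3to5[bottom_offset:]
    return all(PAIR[x] == y for x, y in zip(a, b))

# Step 2b linker: 5' GATC overhang on the top strand (BclI-compatible end).
TOP_2B = "GATCAAGCTTCTGCAGTCGACGCATGCGGATCCGGTACCCGGGAGCTCG"
BOTTOM_2B = "TTCGAAGACGTCAGCTGCGTACGCCTAGGCCATGGGCCCTCGAGCTTAA"

# Step 4a linker: blunt left end, 3' AGCT overhang on the right (SstI end).
TOP_4A = "CGAACGCGTGGCCGATATCATCGATGGATCCGTCGACAAGCTTCCCGGGAGCT"
BOTTOM_4A = "GCTTGCGCACCGGCTATAGTAGCTACCTAGGCAGCTGTTCGAAGGGCCC"

SITES = {"MluI": "ACGCGT", "EcoRV": "GATATC", "ClaI": "ATCGAT", "BamHI": "GGATCC",
         "SalI": "GTCGAC", "HindIII": "AAGCTT", "SmaI": "CCCGGG"}

def sites_in_order(top):
    """Names of the recognition sites present in the top strand, left to right."""
    found = [(top.find(seq), name) for name, seq in SITES.items() if seq in top]
    return [name for _, name in sorted(found)]

print(strands_pair(TOP_2B, BOTTOM_2B, top_offset=4))
print(strands_pair(TOP_4A, BOTTOM_4A))
print(sites_in_order(TOP_4A))
```

A mismatch reported by `strands_pair` would point to a transcription error in one of the strands.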

Plasmid pREV 2.2 can be isolated from its E. coli host by well known procedures. This plasmid was
deposited in the E. coli host JM103 with the Northern Regional Research Laboratory (NRRL, U.S.
Department of Agriculture, Peoria, Illinois, USA) on July 20, 1986 and was assigned an NRRL accession
number.

Plasmid pREV 2.1 was constructed using plasmid pREV 2.2 and a synthetic oligonucleotide. An
example of how to construct pREV 2.1 is as follows:

1. Plasmid pREV 2.2 is cleaved with restriction enzymes NruI and BamHI and the 4 Kb fragment is
isolated from an agarose gel.

2. The following double-strand oligonucleotide is synthesized: 

5' CGAACGCGTGGTCCGATATCATCGATG 3' 
3' GCTTGCGCACCAGGCTATAGTAGCTACCTAG 5' 



3. The fragments from 1 and 2 are ligated in 20 ul using T4 DNA ligase, transformed into competent
E. coli cells and chloramphenicol resistant colonies are isolated.

4. Plasmid clones are identified that contain the oligonucleotide from 2, spanning the region from the
NruI site to the BamHI site and recreating these two restriction sites. This plasmid is termed pREV 2.1.

Cloning the Ligation Products into an Expression Vector. The expression vector is digested to yield ...
T4 DNA ligase is added and the reaction is incubated overnight at 16°C.

Following are examples which illustrate procedures, including the best mode, for practicing the
invention. These examples should not be construed as limiting. All percentages are by weight and all
solvent mixture proportions are by volume unless otherwise noted.
Example 1 - Strategy for Synthesis of Random Sequence Genes

Model studies using several oligonucleotide duplexes were performed to assess this method for
synthesizing genes encoding polypeptides of predetermined amino acid composition but random
sequences. The synthetic genes were analyzed with respect to size, ligation junctions, composition, sequence,
and levels of expression.

Size. Synthetic genes within broad size ranges are produced by varying the ratio of adaptors to 9-mer
duplexes. We are able to select genes within more limited size ranges by resolving the ligation products on
agarose gels and excising gel slices containing products of a certain length. Using these procedures, genes
in the following three size ranges have been isolated: 75-150, 280-320, and 400-600 nucleotides.

Ligation junctions. The synthetic genes have been sequenced to demonstrate that the 9-mer duplexes
are joined end to end and without insertion or deletion of any nucleotides. Correct junctions are necessary
for maintaining the reading frame and thus producing genes that encode polypeptides of the expected
amino acid composition. Sequence analysis revealed that the junctions between the duplexes are correct
(Figure 6).

Composition. To control the amino acid composition of the encoded polypeptides, the synthetic genes
must contain the duplexes added to the ligation reactions. The results from three gene synthesis
experiments demonstrated that the synthesized genes are composed of the duplexes included in the input
steps of the synthesis.

Sequence. Since each duplex can ligate to any other, the order of the 9-mer duplexes in the synthetic
genes should be random. This randomness is demonstrated in the example shown in Figure 6. The
synthetic gene shown in this example is composed of duplexes encoding the following amino acid
segments: KKA, EAE, KAK and YKK.

Expression. To measure expression levels, synthetic genes are cloned into vectors such that the
polypeptides are expressed either as fusions with heterologous vector-derived peptide sequences, or as
nonfusion polypeptides. The levels of expression of the fusion products can be readily measured by
Western blot analysis using antisera directed against the vector-derived portion of the fusion protein. Figure
7 demonstrates the expression of four COP-1-containing fusion polypeptides.

Example 2 - Variations for Synthesizing Random Sequence Genes with Small DNA Duplexes 

DNA duplexes of other lengths can be used in an approach similar to that described for duplexes of 9
nucleotides. Since three nucleotides code for one amino acid, strands of 3, 6, 12, 15, or 18 nucleotides can
be annealed, forming duplexes that code for small blocks of amino acids (see Figure 8). More sequence
variation occurs by mixing small rather than large duplexes.

In addition, terminal extensions of more than one nucleotide can be employed. Figure 8 shows an
example using duplexes of 12 nucleotides and extensions of 3 nucleotides. XXX represents any codon; the
codons in the duplexes are varied to produce polypeptides of the desired amino acid composition. This
approach restricts the polypeptide sequences since the amino acid encoded by the duplex junctions,
alanine (Ala) in this example, is repeated every fourth amino acid. Also, in another variation of the subject
invention, extensions of 5' nucleotides can be employed instead of the 3' extensions illustrated throughout
this report.
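
The periodic alanine that results from 3-nucleotide extensions can be illustrated with a small simulation. This is a hedged sketch only: the codon choices are hypothetical, and the junction codon is assumed to sit at the end of each 12-nucleotide unit, which may differ from the exact layout drawn in Figure 8.

```python
import random

JUNCTION = "GCA"                                  # hypothetical alanine codon at the junction
VARIABLE_CODONS = ["GCA", "GAA", "AAA", "TAC"]    # hypothetical codons for A, E, K, Y
TRANSLATE = {"GCA": "A", "GAA": "E", "AAA": "K", "TAC": "Y"}

def random_12mer(rng):
    """Three freely chosen codons followed by the fixed junction codon."""
    return "".join(rng.choice(VARIABLE_CODONS) for _ in range(3)) + JUNCTION

def assemble(n_duplexes, seed=0):
    """Ligate n random 12-mers and translate the resulting gene."""
    rng = random.Random(seed)
    gene = "".join(random_12mer(rng) for _ in range(n_duplexes))
    return "".join(TRANSLATE[gene[i:i + 3]] for i in range(0, len(gene), 3))

peptide = assemble(10)
print(peptide)
print(peptide[3::4])   # every fourth residue, starting with the first junction
```

Whatever codons are drawn for the variable positions, the junction residue recurs with a period of four.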


Example 3 - Synthesis of Single-Stranded Random Sequence Genes 

Gene synthesis using DNA duplexes results in double stranded genes that are ready for cloning into an
expression vector. An alternative strategy entails producing single-stranded, random sequence genes. The
application of this method to COP-1 is illustrated in Figure 9. Three-nucleotide oligomers corresponding to
codons for each of the amino acids in COP-1 are synthesized, mixed in appropriate ratios, and chemically
polymerized in solution to produce long single-stranded COP-1 genes. The complementary strand of DNA
is made enzymatically using reverse transcriptase or DNA polymerase. The double-stranded DNA is
prepared for cloning by digestion and repairing the ends of the molecules.

Single-stranded, random sequence genes could also be made by performing the polymerization step on
a DNA synthesizer. Typically, synthetic DNA is assembled one nucleotide at a time using phosphoramidite
nucleotide precursors. We developed a strategy for synthesizing single-stranded DNA in three nucleotide
segments ("codons") by using phosphoramidite trinucleotides (Figure 10). The use of 3-nucleotide building
blocks instead of single nucleotides is necessary to ensure that only specific codons, those corresponding
to the phosphoramidite trinucleotides, would occur in the synthetic genes. To test this strategy, we
commissioned the custom synthesis of a phosphoramidite trinucleotide. We observed polymerization of the
trinucleotide on the DNA synthesizer.
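
The effect of coupling whole codons rather than single nucleotides can be sketched as a weighted random draw. This is an illustration under stated assumptions, not the synthesizer chemistry: the codon assigned to each amino acid in `CODON_FOR` is hypothetical, and each coupling step is assumed to pick a trinucleotide in proportion to the 6:2:5:1 molar ratio.

```python
import random

CODON_FOR = {"A": "GCT", "E": "GAA", "K": "AAA", "Y": "TAC"}   # hypothetical codon choices
MOLAR_RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}                 # Ala:Glu:Lys:Tyr from the text

def synthesize_gene(n_codons, seed=None):
    """Build a single-stranded gene by adding one trinucleotide per coupling step."""
    rng = random.Random(seed)
    amino_acids = list(MOLAR_RATIO)
    weights = [MOLAR_RATIO[aa] for aa in amino_acids]
    residues = rng.choices(amino_acids, weights=weights, k=n_codons)
    return "".join(CODON_FOR[aa] for aa in residues)

gene = synthesize_gene(200, seed=1)
codons = {gene[i:i + 3] for i in range(0, len(gene), 3)}
print(len(gene), "nucleotides; codons present:", sorted(codons))
```

Because only the supplied trinucleotides are ever coupled, no stop codon or unintended codon can arise in frame, which single-nucleotide coupling of mixed bases would not guarantee.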



Example 4 - Expression of Random Sequence Genes in Vector Producing Fusion Proteins 

SynUietic random sequence genes can be cloned Into gene fusion vectors so that the expressed 
Dolvoeptides are comprised of a vector-derived polypeptide linked to the random sequence polypeptide. 
The application of this method to COP-1 is described here. SynUietic COP-1 genes are cloned mto tiie 
exoression vector denoted pREV 2.1 within the polylinker site. Upon expression from pRev 21a 
polypeptide is synthesized comprising ttie amino-terminal portion (approximately 25 to approximately 45 
amino acids) of the bacterial protein linked by a peptide bond to Uio COP-t polypeptide. 

We have tested twelve different COP-1 genes for expression in pREV 2.1. Fusion polypeptides were
found to be expressed from ten of the twelve constructs. Figure 7 is a Western blot demonstrating the
fusion proteins produced from four different clones, as detected by binding of antisera specific for the
bacterial portion of the fusion protein. Detection of fusion proteins in the expected size range using antisera
specific for the vector-derived portion is dependent on the presence of a COP-1 gene sequence since the
bacterial peptide alone is much smaller.

Clones 1, 2, and 3 each produce fusion polypeptides comprised of 34 amino acids of the vector-derived
protein (approximately 3,900 daltons) and approximately 130-200 amino acids encoded by the random
sequence genes (15,000-23,000 daltons). Based on migration through SDS gels, the molecular weights of
the fusion polypeptides are in the correct range. Clone 4 produces lower levels of several smaller
polypeptides which may be generated upon degradation of the largest species.
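The size figures quoted above are consistent with roughly 115 daltons per residue (3,900/34 is about 115, and 23,000/200 is 115). The sketch below simply applies that average; the constant is an assumption inferred from the quoted numbers rather than a value stated in the text, and the function name is hypothetical.

```python
# Assumed average residue mass, inferred from the figures quoted in the text.
AVG_RESIDUE_DA = 115


def estimate_mass(n_residues, avg=AVG_RESIDUE_DA):
    """Rough polypeptide mass in daltons from its residue count."""
    return n_residues * avg


vector_part = estimate_mass(34)                          # 3910
random_part = (estimate_mass(130), estimate_mass(200))   # (14950, 23000)
fusion_range = tuple(vector_part + m for m in random_part)
print(fusion_range)                                      # (18860, 26910)
```

On this estimate the intact fusions would be expected to migrate at roughly 19,000 to 27,000 daltons on SDS gels.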

The amino acids which are at the junction of the vector-derived and COP-1 polypeptides are encoded
by the 5' oligonucleotide adaptor duplex. The adaptor duplex can be designed to encode a methionine
residue between the bacterial protein and the COP-1 sequences. In this case, the COP-1 polypeptides can
be released from the fusion protein by treatment with cyanogen bromide, which cleaves on the carboxyl
terminal side of methionine residues. Both forms of COP-1 (the fusions and the free polypeptides) are [...]

In addition to pREV 2.1, other fusion vectors can be employed to express COP-1 fusion polypeptides,
and other strategies can be employed to release the COP-1 polypeptides from the bacterial proteins. For
example, to improve the expression of rCOP-1 polypeptides in E. coli, genes coding for rCOP-1-77 and
rCOP-1-19 were subcloned from pREV 2.1 to pBG3-2AN, a plasmid used to express Protein A. pBG3-2AN
has been deposited as described in U.S. Patent No. 4,691,009. The deposit was made on November 20,
1984 and given the accession number NRRL B-15910. The rCOP-1 genes were isolated from the pREV
2.1 recombinant plasmids by digestion with NcoI and EcoRI. The NcoI site occurs in the 5' linker
employed in cloning the rCOP-1 genes and the EcoRI site is in pREV 2.1 downstream of the rCOP-1 gene.
After digestion with the restriction enzymes, the ends of the rCOP-1 genes are blunted with
deoxyribonucleotides and Klenow fragment. DNA fragments containing the rCOP-1 genes are isolated from
agarose gels. pBG3-2AN is digested with NheI, treated with phosphatase, and the ends of the DNA are
blunted with deoxyribonucleotides and Klenow fragment. After ligation and plating, pBG3-2AN recombinants
bearing rCOP-1 genes in the correct orientation are identified by DNA sequence analysis. The resulting
plasmids encode fusion proteins consisting of β-glucuronidase, Protein A, and rCOP-1 sequences. A
methionine residue occurs between the Protein A and rCOP-1 sequences, originating from the 5' linker
sequence, in order that the COP-1 polypeptide may be cleaved from the fusion protein. The nucleotide and
amino acid sequences for rCOP-1-77 and rCOP-1-19 are shown in Figures 11 and 12, respectively. rCOP-1-19
contains oligonucleotide duplexes encoding the following amino acid segments: YKK, AAE, KAK, [...].
rCOP-1-77 contains oligonucleotide duplexes encoding the following amino acid segments: YKK, EAE, KAK,
AAK, and AAA. The N-terminal alanine residue in each sequence is left behind following CNBr cleavage of
the fusion protein.

The invention should not be limited to the examples described above.

Example 5 - Expression of Random Sequence Genes in Non-Fusion Vectors 

Synthetic genes can be cloned into expression vectors such that the polypeptide products are not fused
to vector-derived protein sequences. As in fusion vectors, these non-fusion vectors contain the appropriate
transcriptional and translational signals; however, the synthetic genes are linked directly adjacent to the
translation initiation signal such that they contain a methionine residue as the amino-terminal amino acid.
This single methionine can be removed from COP-1 polypeptides by cyanogen bromide cleavage. COP-1
polypeptides with and without this amino-terminal methionine are tested for biological activity.
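The effect of cyanogen bromide on such constructs can be illustrated with a small sequence-level sketch. It assumes idealized, complete cleavage on the carboxyl side of every methionine; the example sequences and the function name are hypothetical and are not sequences from this specification.

```python
def cnbr_cleave(protein):
    """Split a one-letter amino acid sequence after every methionine (M).

    Idealized model: cleavage is assumed complete. Chemically, the methionine
    ends up as a modified residue at the C-terminus of the upstream fragment.
    """
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical non-fusion product: initiator Met followed by a random A/E/K/Y polymer.
print(cnbr_cleave("MAKKYAEAK"))      # ['M', 'AKKYAEAK']
# Hypothetical fusion: vector-derived residues, a junction Met, then the polymer.
print(cnbr_cleave("GSTLMAYKKEAE"))   # ['GSTLM', 'AYKKEAE']
```

Because the COP-1 constituents (alanine, glutamic acid, lysine, tyrosine) do not include methionine, the random sequence portion itself is left intact in this model.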



Example 6 - Purification of rCOP-1 Polypeptides 

The purification of rCOP-1 polypeptides can be accomplished by a number of methods which are well
known to those skilled in this art. For example, E. coli cells expressing Protein A/rCOP-1 fusion proteins are
grown in a fermenter, collected by centrifugation, and lysed using a dynamill. The extract is centrifuged to
remove debris, adjusted to contain 8 M urea, and chromatographed on an S-Sepharose column using a
sodium chloride gradient for elution. Fractions containing the fusion protein are dialyzed against a solution
of glycerol in phosphate buffered saline; the dialysate is centrifuged to remove contaminating proteins that
precipitate during dialysis. The Protein A/rCOP-1 fusion protein is cleaved with cyanogen bromide and the
rCOP-1 polypeptide is purified by gel filtration and reverse phase HPLC.



Example 7 - EAE Experiments 

rCOP-1 has been tested for efficacy in suppressing experimental allergic encephalomyelitis (EAE). As
described above, EAE is a T-cell mediated autoimmune disease that is employed as a model for the human
disease multiple sclerosis. EAE experiments are performed essentially as described by Swanborg
(Swanborg RH [1988] "Experimental Allergic Encephalomyelitis." In Methods in Enzymology, vol. 162, p.
413, Academic Press, Inc.). For example, the disease is induced in Hartley guinea pigs by a single,
subcutaneous injection of 10 µg of guinea pig myelin basic protein in Freund's adjuvant containing 100 µg
of Mycobacterium tuberculosis. Onset of disease occurs about 12 to 20 days after induction. The disease is
scored on a scale of 0-4: 0 = no disease; 1 = loss of coordination in hind limbs; 2 = paralysis of one or
both hind limbs; 3 = paralysis extending to include one or both front limbs, can include incontinence of
bladder or bowel; and 4 = extensive paralysis, inability to move. Animals are scored every 2-3 days from
the onset of disease, and most animals spontaneously recover from the disease. The treatment protocol
consists of intramuscular injections of 500 mg of test material at 1, 6, and 11 days after induction of
disease. The dosage, route of administration, and schedule for treatments can be varied. Also, EAE
experiments can be performed in other species including rats and mice. Other variations of the
experimental protocols and scoring may be used.

Two rCOP-1 molecules, rCOP-1-77 and rCOP-1-19, have recently been tested in the EAE experiments.
The production and purification of these molecules were performed in accordance with the procedures
described in Examples 4 and 6. Guinea pigs treated with rCOP-1-77 or rCOP-1-19 were compared to
animals treated with myelin basic protein, a positive control, and to an untreated group. The graph in Figure
13 shows the average EAE scores for each treatment group versus days after induction. In Table 1, several
aspects of disease, such as incidence, day of onset, maximum severity, and duration, are compared.

Table 1.

Effects of rCOP-1 and MBP on EAE

Group                  Incidence¹   Day of Onset²        Max. Severity²      Duration²
Untreated              7/7          13.5 ± 1.0 (13-15)   2.3 ± 0.6 (1-3)     10.8 ± 3.8 (4-15)
rCOP-1-19              8/8          16.8 ± 3.7 (13-25)   2.8 ± 1.4 (1-4)     8.6³ ± 5.2 (2-15)
rCOP-1-77              7/8          17.6 ± 2.8 (15-22)   2.8 ± 1.0 (1-4)     9.9 ± 4.5 (2-15)
Myelin Basic Protein   8/8          18.1 ± 2.6 (15-20)   [?].3 ± 1.1 (1-4)   5.3 ± 5.5 (2-15)

¹ Incidence = Number with disease/number tested.

² Values are the mean ± standard deviation. Ranges are in parentheses.

³ One animal in this group died.
The results can be summarized as follows: rCOP-1-77 and rCOP-1-19 both delayed the onset of disease.
Using the treatment regimen described here, rCOP-1 molecules did not [...] disease incidence, maximum
severity, or duration. Myelin basic protein delayed onset and decreased [...]

[...] is that the maximum severity for the untreated animals (2.3) is unusually low. In a pilot
experiment with two guinea pigs, the severity scores were 2 and 4. It may be possible to optimize the
effects of rCOP-1 by varying the treatment procedure.

Example 8 - Other Applications for Random Sequence Polymers of Amino Acids

Genes encoding random sequence polypeptides composed of other amino acids can be synthesized
using the procedures described herein. Since the properties of the polypeptides are dependent on the [...]

[...] subsequent oxidation of cysteine residues to cysteic acid, changes that produce undesirable effects on hair.
Random sequence polypeptides may be able to interact with damaged hair and neutralize these effects.

[...] is as supplements for diets deficient in certain amino acids.

Claims 

1. A method for making a synthetic gene encoding a random polymer of amino acids wherein said
polymer has predetermined amino acid constituents, said method comprising the polymerization of small
oligonucleotide duplexes.

2. A method, according to claim 1, wherein the length of said synthetic genes is controlled through the [...]

[...]

[...] is multiple sclerosis.
14. [...] predetermined amino acid constituents, said method comprising the synthesis of genes encoding said
random polymers, said synthesis comprising the polymerization of small oligonucleotide duplexes, said
method for making polypeptides further comprising the cloning of said synthetic genes into an expression
vector and transferring said vector into a bacteria or eukaryotic cell capable of expressing said polypeptide.
15. A method, according to claim 14, wherein said bacteria is an Escherichia coli.
16. A method for making a fusion polypeptide having one portion comprised of all or part of a
heterologous polypeptide and a second portion comprised of a random sequence wherein said random
sequence portion has predetermined amino acid constituents, and said method comprises the synthesis of
genes encoding said random polymers, said synthesis comprising the polymerization of small
oligonucleotide duplexes, said method further comprising the cloning of said synthetic genes into an
expression vector adjacent to a DNA sequence encoding all or part of said heterologous polypeptide and
transferring said vector into a bacteria or eukaryotic cell capable of expressing said fusion polypeptide.

17. A method for synthesizing and identifying a biologically or immunologically active polypeptide, or
mixture of polypeptides, said method comprising the following steps:

(a) synthesizing genes encoding random polypeptides by polymerization of small oligonucleotide
duplexes;

(b) cloning each of said synthetic genes into a vector and transferring said vector into a host capable
of expressing said polypeptides;

(c) growing said hosts under conditions which permit the formation of recombinant colonies which
each express one recombinant gene;

(d) testing the polypeptide, or a mixture of the recombinant polypeptides, combined either before or
after isolation from said colonies, for evidence of biological and/or immunological activity;

(e) where activity is observed for the mixture of polypeptides, generating smaller subsets of that
combination and testing each of these subsets for biological activity; and

(f) repeating step (e) until the active component(s) or a suitable mixture thereof is obtained.

18. A synthetic gene which codes for rCOP-1-19. 

19. A synthetic gene which codes for rCOP-1-77. 
Strategy for Synthesizing Random Genes and
Identifying Recombinant Polypeptides with Biological Activity

Figure 1

[Flow diagram] mixture of oligonucleotide duplexes -> ligate -> random sequence synthetic genes ->
clone in an expression vector -> recombinant colonies, each contains one synthetic gene

Identification of biologically active polypeptides: pool colonies (e.g. 1000), isolate polypeptides,
test for biological activity -> fractionate into smaller pools of colonies (e.g. 10 pools of 100
colonies) -> select individual colonies -> isolate the polypeptide, test for biological activity
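The pool-and-fractionate scheme in Figure 1 can be sketched as a recursive search. This is a hedged illustration only: the `is_active` callable stands in for the biological assay, the colony numbers are invented, and the sketch assumes that a pool tests positive whenever it contains at least one active colony, which need not hold if activity depends on a mixture of polypeptides.

```python
def find_active(colonies, is_active, n_subpools=10):
    """Subdivide an active pool repeatedly until single active colonies remain.

    colonies  : list of colony identifiers
    is_active : callable taking a list of colonies and returning True if the
                pooled polypeptides show activity (placeholder for the assay)
    """
    if not colonies or not is_active(colonies):
        return []
    if len(colonies) == 1:
        return list(colonies)
    size = -(-len(colonies) // n_subpools)   # ceiling division: sub-pool size
    hits = []
    for start in range(0, len(colonies), size):
        hits.extend(find_active(colonies[start:start + size], is_active, n_subpools))
    return hits


# Hypothetical assay in which colonies 137 and 802 are the only active ones.
ACTIVE = {137, 802}


def assay(pool):
    return any(colony in ACTIVE for colony in pool)


print(find_active(list(range(1000)), assay))   # [137, 802]
```

With 1000 colonies and ten sub-pools per round, the search passes through pools of 100 and 10 before reaching individual colonies, mirroring the example sizes given in the figure.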
[Flow diagram] Oligonucleotides (9-mers) -> anneal -> Duplexes -> ligate -> Synthetic Genes ->
expression -> Polypeptides

Example polypeptides:

Ala-Ala-Ala-Lys-Lys-Lys-Ala-Ala-Ala-Lys-Lys-Lys-Ala-Ala-Ala
Ala-Ala-Ala-Ala-Ala-Ala-Lys-Lys-Lys-Lys-Lys-Lys-Lys-Lys-Lys
Lys-Lys-Lys-Ala-Ala-Ala-Ala-Ala-Ala-Lys-Lys-Lys-Ala-Ala-Ala

Duplexes encode: Ala Ala Ala; Lys Lys Lys
Figure 3

All Possible 3 Amino Acid Combinations
and Their Percent Occurrence in Cop 1

AAA  7.872    AAE  2.624    AAK  6.560    AAY  1.312
AEA  2.624    AEE  0.875    AEK  2.187    AEY  0.437
AKA  6.560    AKE  2.187    AKK  5.466    AKY  1.093
AYA  1.312    AYE  0.437    AYK  1.093    AYY  0.219
EAA  2.624    EAE  0.875    EAK  2.187    EAY  0.437
EEA  0.875    EEE  0.292    EEK  0.729    EEY  0.146
EKA  2.187    EKE  0.729    EKK  1.822    EKY  0.364
EYA  0.437    EYE  0.146    EYK  0.364    EYY  0.073
KAA  6.560    KAE  2.187    KAK  5.466    KAY  1.093
KEA  2.187    KEE  0.729    KEK  1.822    KEY  0.364
KKA  5.466    KKE  1.822    KKK  4.555    KKY  0.911
KYA  1.093    KYE  0.364    KYK  0.911    KYY  0.182
YAA  1.312    YAE  0.437    YAK  1.093    YAY  0.219
YEA  0.437    YEE  0.146    YEK  0.364    YEY  0.073
YKA  1.093    YKE  0.364    YKK  0.911    YKY  0.182
YYA  0.219    YYE  0.073    YYK  0.182    YYY  0.036

A=Alanine
E=Glutamic Acid
K=Lysine
Y=Tyrosine
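The legible entries in Figure 3 appear consistent with multiplying the mole fractions of the three residues, taking the COP-1 molar ratio of alanine, glutamic acid, lysine and tyrosine as 6:2:5:1 and treating each position as independent. The sketch below recomputes the table under that assumption; the independence assumption and the function name are ours, not a statement of how the figure was originally prepared.

```python
from itertools import product

# Ala:Glu:Lys:Tyr molar ratio of COP-1, as given in the specification.
RATIO = {"A": 6, "E": 2, "K": 5, "Y": 1}
TOTAL = sum(RATIO.values())   # 14


def tripeptide_percent(tri):
    """Percent occurrence of a 3-residue combination, assuming independent positions."""
    p = 1.0
    for aa in tri:
        p *= RATIO[aa] / TOTAL
    return 100 * p


table = {"".join(t): round(tripeptide_percent(t), 3) for t in product(RATIO, repeat=3)}
print(table["AAA"], table["KKK"], table["YYY"])   # 7.872 4.555 0.036
```

The 64 unrounded percentages sum to 100, which gives a quick check on any reconstructed entry.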
Figure 4

9-Nucleotide Duplexes and Adaptors
for Cop 1 Gene Synthesis

Examples of 9-Nucleotide Duplexes for Cop 1 Genes:

Coding    (5' end)   AAGAAGGCA GAAGCAGAA   (3' end)
Noncoding (3' end)  TTTCTTCCG TCTTCGTCT    (5' end)
Amino acids           K  K  A   E  A  E

Adaptors:

For the 5' end   GATC-   (BamH I)

For the 3' end   -AGCT   (Sac I)
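In the example duplexes of Figure 4 the noncoding strand is displaced by one nucleotide relative to the coding strand, so each duplex carries a single-base extension at either end. The sketch below derives a noncoding 9-mer from a coding 9-mer under that reading. It assumes, as in the examples shown, that every coding 9-mer ends in A; the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def noncoding_9mer(coding):
    """Return the noncoding 9-mer, written 3'->5', for a one-nucleotide offset.

    Assumes the upstream coding 9-mer ends in A, so the noncoding strand
    starts with the T that pairs with that overhanging A.
    """
    if len(coding) != 9 or coding[-1] != "A":
        raise ValueError("expected a 9-mer ending in A")
    return "T" + "".join(COMPLEMENT[base] for base in coding[:8])


print(noncoding_9mer("AAGAAGGCA"))   # TTTCTTCCG
print(noncoding_9mer("GAAGCAGAA"))   # TCTTCGTCT
```

Both results match the noncoding strand printed in the figure. Because every duplex presents the same single-base extensions under this reading, any duplex can be ligated next to any other.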
Figure 5

Construction and Cloning
Synthetic COP 1 Genes

[Flow diagram] 5' adaptor (GATC-, BamH I) + 9-mers + 3' adaptor (-AGCT, Sac I) -> ligase ->
synthetic genes with GATC- (BamH I) and -AGCT (Sac I) ends -> size-select genes ->
clone into plasmid vector cut with BamH I and Sac I
Figure 6

Sequence Analysis of a
Synthetic Gene

amino acids: Y K K E A E K A K
nucleotides: TACAAGAAA|GAAGCAGAA|AAGGCTAAA|

amino acids: Y K K Y K K K K A
nucleotides: TACAAGAAA|TACAAGAAA|AAGAAGGCA|

amino acids: E A E K K A E A E
nucleotides: GAAGCAGAA|AAGAAGGCA|GAAGCAGAA|

amino acids: K K A E A E E A E
nucleotides: AAGAAGGCA|GAAGCAGAA|GAAGCAGAA|

amino acids: K K A K A K E A E
nucleotides: AAGAAGGCA|AAGGCTAAA|GAAGCAGAA|

amino acids: K K A K K A K A K
nucleotides: AAGAAGGCA|AAGAAGGCA|AAGGCTAAA|

The lines in the nucleotide sequence mark the junctions
between 9-mer duplexes.

Amino Acids
A=alanine
K=lysine
E=glutamic acid
Y=tyrosine
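The correspondence between the nucleotide rows and the amino acid rows in Figure 6 can be checked with a minimal translation sketch. Only the codons that occur in the figure are listed, the helper names are hypothetical, and the example row is one of the rows from the figure (KKA, KAK, EAE).

```python
# Partial codon table: only the codons seen in the Figure 6 sequences.
CODON_TABLE = {"GCA": "A", "GCT": "A", "GAA": "E", "AAA": "K", "AAG": "K", "TAC": "Y"}


def translate(nt):
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def split_9mers(nt):
    """Split a gene segment at the junctions between 9-mer duplexes."""
    return [nt[i:i + 9] for i in range(0, len(nt), 9)]


row = "AAGAAGGCAAAGGCTAAAGAAGCAGAA"
print(split_9mers(row))   # ['AAGAAGGCA', 'AAGGCTAAA', 'GAAGCAGAA']
print(translate(row))     # KKAKAKEAE
```

Each 9-mer translates to exactly three residues, so the duplex junctions always fall between codons.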
Figure 8

Variations of Random-Gene Synthesis
using Small Duplexes

One Nucleotide Extension:

3-mers     [...]

6-mers      GCAAAA
           TCGTTT      TCTTCG

12-mers     GCAGAAGCAAAA
           TCGTCTTCGTTT

Three Nucleotide Extensions, 12-mer Duplexes:

duplex         XXXXXXXXXGCA
            CGTXXXXXXXXX

ligate

gene           XXXXXXXXXGCAXXXXXXXXXGCAXXXXXXXXXGCA
            CGTXXXXXXXXXCGTXXXXXXXXXCGTXXXXXXXXX

polypeptide    Ala aa aa aa Ala aa aa aa Ala aa aa aa Ala

XXX = any codon
aa = amino acid specified by codon XXX
Ala = alanine
Figure 9

Synthesis of Single-Stranded DNA Encoding
Random Sequence Amino Acid Polymers

[Flow diagram] 3 nucleotide segments for COP 1, with their ratio (Ala; Glu 5'-GAA-3'; Lys 5'-AAA-3';
Tyr 5'-TAC-3') -> polymerize (A. solution chemistry, or B. DNA synthesizer) -> single-stranded DNA ->
enzymatic synthesis of the second strand -> double-stranded DNA -> clone into expression vector
Figure 10

Phosphoramidite Trinucleotide for Random
Gene Synthesis Using a DNA Synthesizer

[Chemical structures] Phosphoramidite Trinucleotide and Phosphoramidite Mononucleotide.
Labels legible in the structures: DMTr; O=P-O; CN-CH2-CH2O; CH3O; N[CH(CH3)2]2.



EP 0 383 620 A2 



FIGURE 11
rCOP-1-77

[Nucleotide and amino acid sequences of rCOP-1-77]
FIGURE 12
rCOP-1-19

[Nucleotide and amino acid sequences of rCOP-1-19; the final segment reads:]

GCGAAAGCAAAGGCTGCATAA
CGCTTTCGTTTCCGACGTATT
[Graph; axis label: EAE Score]




